On-line experimental methods to evaluate text-to-speech (TTS) synthesis: effects of voice gender and signal quality on intelligibility, naturalness and preference
Abstract
Three experiments are reported that use new experimental methods for the evaluation of text-to-speech (TTS) synthesis from the user’s perspective. Experiment 1, using sentence stimuli, and Experiment 2, using discrete “call centre” word stimuli, investigated the effect of voice gender and signal quality on the intelligibility of three concatenative TTS synthesis systems. Accuracy and search time were recorded as on-line, implicit indices of intelligibility during phoneme detection tasks. It was found that both voice gender and noise affect intelligibility. Results also indicate interactions of voice gender, signal quality, and TTS synthesis system on accuracy and search time. In Experiment 3 the method of paired comparisons was used to yield ranks of naturalness and preference. As hypothesized, preference and naturalness ranks were influenced by TTS system, signal quality and voice, in isolation and in combination. The pattern of results across the four dependent variables (accuracy, search time, naturalness, preference) was consistent. Natural speech surpassed synthetic speech, and TTS system C elicited relatively high scores across all measures. Intelligibility, judged naturalness and preference are modulated by several factors and there is a need to tailor systems to particular commercial applications and environmental conditions.
© 2004 Elsevier Ltd. All rights reserved.
Corresponding author: C. Stevens. Tel.: +61-2-9772-6324; fax: +61-2-9772-6736.
doi:10.1016/j.csl.2004.03.003
C. Stevens et al. / Computer Speech and Language 19 (2005) 129–146
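The abstract says that Experiment 3 used the method of paired comparisons to yield ranks, but it does not describe the scoring procedure. The sketch below shows one common way such ranks can be derived: count how often each stimulus is chosen across all pairs and sort by that count. The function name, the stimulus labels, and the listener choices are all hypothetical, and the data are invented for illustration only; they are not results from the paper.

```python
from collections import Counter


def rank_from_paired_choices(items, choices):
    """Rank items by how often each was chosen across paired trials.

    items   -- list of stimulus labels (hypothetical names)
    choices -- list of (left, right, chosen) tuples, one per trial
    """
    wins = Counter({item: 0 for item in items})
    for left, right, chosen in choices:
        if chosen not in (left, right):
            raise ValueError(f"chosen item {chosen!r} is not in the pair")
        wins[chosen] += 1
    # Sort by descending win count; ties are broken by label for determinism.
    ordered = sorted(items, key=lambda item: (-wins[item], item))
    return ordered, dict(wins)


# Invented example data (assumption): one listener, every pair presented once.
items = ["natural", "tts_a", "tts_b", "tts_c"]
choices = [
    ("natural", "tts_a", "natural"),
    ("natural", "tts_b", "natural"),
    ("natural", "tts_c", "natural"),
    ("tts_a", "tts_b", "tts_b"),
    ("tts_a", "tts_c", "tts_c"),
    ("tts_b", "tts_c", "tts_c"),
]
ranking, wins = rank_from_paired_choices(items, choices)
print(ranking)
```

A simple win count treats every comparison equally. A study with several listeners and experimental conditions would more likely aggregate counts per condition or fit a scaling model, which this sketch does not attempt.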
Similar resources
Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: effects of voice gender and signal quality
Two experiments are reported that constitute new methods for evaluation of text-to-speech (TTS) synthesis from the user’s perspective. Experiment 1, using sentence stimuli, and Experiment 2, using discrete word stimuli, investigate the effect of voice gender and signal quality on the intelligibility of three TTS synthesis systems from the user’s point of view. Accuracy scores and reaction time ...
Implicit Measurement of Intelligibility of Male and Female Voice Text-to-speech (TTS) Synthesis in Noise Using a Phoneme Detection Task
ABSTRACT: Given the increasing application of TTS synthesis in commercial and clinical settings, there is a need to develop methods of evaluation from the user’s perspective. An experiment is reported that compares the effect of two factors, voice gender and signal quality, on the intelligibility of three TTS systems from the user’s point of view. It was hypothesised that male voiced TTS would ...
The new version of the ROMVOX text-to-speech synthesis system based on a hybrid time domain-LPC synthesis technique
Through the years we developed several TTS systems for the Romanian language, each of them presenting some advantages and disadvantages [2]. Taking into account that waveform coding (time domain) methods assure a maximum level of intelligibility and naturalness of the synthesized speech, and that superimposing prosodic effects requires the alteration of pitch (frequency domain), we developed a...
Foreign Accents in Synthetic Speech: Development and Evaluation
This paper addresses the generation and evaluation of foreign-accented speech in concatenative text-to-speech (TTS) synthesis. We describe three possible methods of building a Spanish-accented English voice, and evaluate and compare them with respect to preference, intelligibility, and smoothness. Effects of speaking rate and content are also examined. It is found that although using an unmodif...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
Journal title:
- Computer Speech & Language
Volume 19
Pages 129–146
Publication date: 2005